4,131 research outputs found

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval

Author: Chen Da
Chen Qi
Deng Chaorui
Qin Pengda
Wu Qi
Publication date: 15/08/2023

In text-video retrieval, recent works have benefited from the powerful learning capabilities of pre-trained text-image foundation models (e.g., CLIP) by adapting them to the video domain. A critical problem for these methods is how to effectively capture the rich semantics inside the video using the image encoder of CLIP. To tackle this, state-of-the-art methods adopt complex cross-modal modeling techniques to fuse the text information into video frame representations. This, however, incurs severe efficiency issues in large-scale retrieval systems, because the video representations must be recomputed online for every text query. In this paper, we discard this problematic cross-modal fusion process and aim to learn semantically-enhanced representations purely from the video, so that the video representations can be computed offline and reused for different texts. Concretely, we first introduce a spatial-temporal "Prompt Cube" into the CLIP image encoder and iteratively switch it within the encoder layers to efficiently incorporate the global video semantics into frame representations. We then propose to apply an auxiliary video captioning objective to train the frame representations, which facilitates the learning of detailed video semantics by providing fine-grained guidance in the semantic space. With a naive temporal fusion strategy (i.e., mean-pooling) on the enhanced frame representations, we obtain state-of-the-art performance on three benchmark datasets, i.e., MSR-VTT, MSVD, and LSMDC. Comment: to appear in ICCV202
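The offline/online split described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the frame features below are made-up two-dimensional toy vectors standing in for encoder outputs, and the helper names (`mean_pool`, `cosine`, `rank_videos`) are hypothetical. It only shows that mean-pooled video vectors can be built once and then reused for any text query.

```python
import math

def mean_pool(frames):
    """Average a list of equal-length frame vectors into one video vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank_videos(text_vec, video_index):
    """Return video ids sorted by descending similarity to the text vector."""
    return sorted(video_index,
                  key=lambda vid: cosine(text_vec, video_index[vid]),
                  reverse=True)

# Offline step: toy frame features (assumed values, not real CLIP outputs).
video_frames = {
    "cooking": [[1.0, 0.0], [0.8, 0.2]],
    "soccer": [[0.0, 1.0], [0.2, 0.8]],
}
video_index = {vid: mean_pool(frames) for vid, frames in video_frames.items()}

# Online step: only the text query is encoded; the video index is reused as is.
query = [0.9, 0.1]
ranking = rank_videos(query, video_index)
```

Because no text information enters `video_index`, a new query costs one similarity computation per stored video rather than a fresh pass over the video frames.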

arXiv.org e-Print Archive

Identity-Consistent Aggregation for Video Object Detection

Author: Chen Da
Deng Chaorui
Wu Qi
Publication date: 15/08/2023

In Video Object Detection (VID), a common practice is to leverage the rich temporal contexts from the video to enhance the object representations in each frame. Existing methods treat the temporal contexts obtained from different objects indiscriminately and ignore their different identities. Intuitively, however, aggregating local views of the same object in different frames may facilitate a better understanding of the object. Thus, in this paper, we aim to enable the model to focus on the identity-consistent temporal contexts of each object to obtain more comprehensive object representations and handle rapid object appearance variations such as occlusion and motion blur. However, realizing this goal on top of existing VID models faces low-efficiency problems due to their redundant region proposals and non-parallel frame-wise prediction manner. To address this, we propose ClipVID, a VID model equipped with Identity-Consistent Aggregation (ICA) layers specifically designed for mining fine-grained and identity-consistent temporal contexts. It effectively reduces the redundancies through the set prediction strategy, making the ICA layers very efficient and further allowing us to design an architecture that makes parallel clip-wise predictions for the whole video clip. Extensive experimental results demonstrate the superiority of our method: a state-of-the-art (SOTA) performance (84.7% mAP) on the ImageNet VID dataset while running at a speed about 7x faster (39.3 fps) than previous SOTAs. Comment: to appear at ICCV202

arXiv.org e-Print Archive

Minimizing the number of edges in $K_{s,t}$-saturated bipartite graphs

Author: Chakraborti Debsoumya
Chen Da Qi
Hasabnis Mihir
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2021

This paper considers an edge minimization problem in saturated bipartite graphs. An $n$ by $n$ bipartite graph $G$ is $H$-saturated if $G$ does not contain a subgraph isomorphic to $H$ but adding any missing edge to $G$ creates a copy of $H$. More than half a century ago, Wessel and Bollobás independently solved the problem of minimizing the number of edges in $K_{(s,t)}$-saturated graphs, where $K_{(s,t)}$ is the 'ordered' complete bipartite graph with $s$ vertices from the first color class and $t$ from the second. However, the very natural 'unordered' analogue of this problem was considered only half a decade ago by Moshkovitz and Shapira. When $s=t$, it can be easily checked that the unordered variant is exactly the same as the ordered case. Later, Gan, Korándi, and Sudakov gave an asymptotically tight bound on the minimum number of edges in $K_{s,t}$-saturated $n$ by $n$ bipartite graphs, which is only smaller than the conjecture of Moshkovitz and Shapira by an additive constant. In this paper, we confirm their conjecture for $s=t-1$ with the classification of the extremal graphs. We also improve the estimates of Gan, Korándi, and Sudakov for general $s$ and $t$, and for all sufficiently large $n$. Comment: Reflected minor suggestions from reviewer
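The saturation definition in this abstract can be checked by brute force on small graphs. The sketch below is only an illustration of the definition for the unordered $K_{s,t}$ case, not a method from the paper. It assumes both sides of the bipartite graph are labelled 0 to n-1 and that edges are given as (left, right) pairs; the function names `has_kst` and `is_kst_saturated` are hypothetical.

```python
from itertools import combinations

def has_kst(edges, n, s, t):
    """True if the n-by-n bipartite graph contains an unordered K_{s,t}."""
    # Neighbourhood (on the right side) of every left vertex.
    nbrs = [{b for (a, b) in edges if a == u} for u in range(n)]
    # Unordered: s left + t right vertices, or t left + s right vertices.
    for left, right in ((s, t), (t, s)):
        for group in combinations(range(n), left):
            common = set(range(n))
            for u in group:
                common &= nbrs[u]
            if len(common) >= right:
                return True
    return False

def is_kst_saturated(edges, n, s, t):
    """True if the graph is K_{s,t}-free but every missing edge creates a copy."""
    edges = set(edges)
    if has_kst(edges, n, s, t):
        return False
    missing = [(a, b) for a in range(n) for b in range(n) if (a, b) not in edges]
    return all(has_kst(edges | {e}, n, s, t) for e in missing)

# A 2-by-2 graph with three edges: no K_{2,2}, but the one missing edge creates it.
example = {(0, 0), (0, 1), (1, 0)}
saturated = is_kst_saturated(example, 2, 2, 2)
```

The search is exponential in $s$ and $t$, so it is only useful for sanity-checking small examples.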

arXiv.org e-Print Archive

IBS Publications Repository

Vertex Downgrading to Minimize Connectivity

Author: Aissi Hassene
Chen Da Qi
Ravi R.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)
Publication date: 01/01/2020

We consider the problem of interdicting a directed graph by deleting nodes with the goal of minimizing the local edge connectivity of the remaining graph from a given source to a sink. We introduce and study a general downgrading variant of the interdiction problem where the capacity of an arc is a function of the subset of its endpoints that are downgraded, and the goal is to minimize the downgraded capacity of a minimum source-sink cut subject to a node downgrading budget. This models the case when both ends of an arc must be downgraded to remove it, for example. For this generalization, we provide a bicriteria (4,4)-approximation that downgrades nodes with total weight at most 4 times the budget and provides a solution where the downgraded connectivity from the source to the sink is at most 4 times that in an optimal solution. We accomplish this with an LP relaxation and rounding using a ball-growing algorithm based on the LP values. We further generalize the downgrading problem to one where each vertex can be downgraded to one of k levels, and the arc capacities are functions of the pairs of levels to which its ends are downgraded. We generalize our LP rounding to get a (4k,4k)-approximation for this case

Dagstuhl Research Online Publication Server

Plasma exosomal microRNAs are non-invasive biomarkers of moyamoya disease: A pilot study

Author: Chen Meng
Huang Da
Qi Hui
Yang Hongchun
Publication venue: Hospital das Clínicas, Faculdade de Medicina, Universidade de São Paulo
Publication date: 05/07/2023

Background: As a progressive cerebrovascular disease, Moyamoya Disease (MMD) is a common cause of stroke in children and adults. However, the early biomarkers and pathogenesis of MMD remain poorly understood. Methods and material: This study was conducted using plasma exosome samples from MMD patients. Next-generation high-throughput sequencing, real-time quantitative PCR, gene ontology analysis, and Kyoto Encyclopaedia of Genes and Genomes pathway analysis of ideal exosomal miRNAs that could be used as potential biomarkers of MMD were performed. The area under the Receiver Operating Characteristic (ROC) curve was used to evaluate the sensitivity and specificity of biomarkers for predicting events. Results: Exosomes were successfully isolated and miRNA-sequence analysis yielded 1,002 differentially expressed miRNAs. Functional analysis revealed that they were mainly enriched in axon guidance, regulation of the actin cytoskeleton and the MAPK signaling pathway. Furthermore, 10 miRNAs (miR-1306-5p, miR-196b-5p, miR-19a-3p, miR-22-3p, miR-320b, miR-34a-5p, miR-485-3p, miR-489-3p, miR-501-3p, and miR-487-3p) were found to be associated with the most sensitive and specific pathways for MMD prediction. Conclusions: Several plasma secretory miRNAs closely related to the development of MMD have been identified, which can be used as biomarkers of MMD and contribute to differentiating MMD from non-MMD patients before digital subtraction angiography
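As a rough illustration of the ROC-based evaluation mentioned in this abstract, the area under the ROC curve can be computed as the fraction of (positive, negative) pairs that a score ranks correctly. The labels and scores below are invented toy values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = patient, 0 = control; scores stand in for a biomarker level.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to a marker that ranks patients and controls no better than chance, and 1.0 to perfect separation.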

Cadernos Espinosanos (E-Journal)

Cyclically 5-Connected Graphs

Author: Chen Da Qi
Publication venue: 'University of Waterloo'
Publication date: 24/08/2016

Tutte's Four-Flow Conjecture states that every bridgeless, Petersen-free graph admits a nowhere-zero 4-flow. This hard conjecture has been open for over half a century, with no significant progress in the first forty years. In recent decades, Robertson, Thomas, Sanders and Seymour have proved the cubic version of this conjecture. Their strategy involved the study of the class of cyclically 5-connected cubic graphs. It turns out that a minimum counterexample to the general Four-Flow Conjecture is also cyclically 5-connected. Motivated by this fact, we wish to find structural properties of this class in hopes of producing a list of minor-minimal cyclically 5-connected graphs

University of Waterloo's Institutional Repository

5-Hydroxy-1-(3-hydroxy-2-naphthoyl)-3,5-dimethyl-2-pyrazoline

Author: Da-Cheng Li
Da-Qi Wang
Dou
Moon
Mukhopadhyay
Sheldrick
Yuehua Zhu
Yuting Chen
Publication venue: International Union of Crystallography
Publication date: 01/08/2008

In the title molecule, C16H16N2O3, intramolecular O—H⋯O hydrogen bonds influence the molecular conformation. Intermolecular O—H⋯O hydrogen bonds [O⋯O = 2.922 (2) Å] link the molecules into centrosymmetric dimers. Weak intermolecular C—H⋯O interactions assemble these dimers into layers parallel to the bc plane

Crossref

Directory of Open Access Journals

PubMed Central